Domain Specific Memory Management for Large Scale Data Analytics
Hardware trends over the last several decades have led to shifting priorities
with respect to performance bottlenecks in the implementations of dataflows
typically present in large-scale data analytics applications. In particular,
efficient use of main memory has emerged as a critical aspect of dataflow
implementation, due to the proliferation of multi-core architectures, as well as
the rapid development of faster-than-disk storage media. At the same time, the
wealth of static domain-specific information about applications remains an
untapped resource when it comes to optimizing the use of memory in a dataflow
application.
We propose a compilation-based approach to the synthesis of memory-efficient
dataflow implementations, using static analysis to extract and leverage
domain-specific information about the application. Our program transformations
use the combined results of type, effect, and provenance analyses to infer
time- and space-effective placement of primitive memory operations, precluding the
need for dynamic memory management and its attendant costs. The experimental
evaluation of implementations synthesized with our framework shows both the
importance of optimizing for memory performance and the significant benefits
of our approach along multiple dimensions.
Finally, we demonstrate a framework for formally verifying the soundness of
these transformations, laying the foundation for their use as a component of a
more general implementation synthesis ecosystem.